COVID-19 is the disease caused by SARS-CoV-2, the coronavirus that emerged in December 2019. COVID-19 can be severe, and has caused millions of deaths around the world as well as lasting health problems in some who have survived the illness. The coronavirus can be spread from person to person. It is diagnosed with a test.
Two years after the outbreak of the coronavirus, the growth in confirmed COVID-19 cases seems to be slowing, which makes this a good time to examine the pandemic as a whole. Here we look at COVID-19 in the United States, and in Maryland in particular, and discuss what the data shows.
The data collection stage is very important. Without proper data to work with, no analysis can be done. Make sure to find credible and recent data to create accurate models and analysis.
In this project the Covid-19 data we used comes from Johns Hopkins University and is available at this link: https://github.com/CSSEGISandData/COVID-19
We used the following Python libraries to load and analyze this data: pandas, numpy, matplotlib, scikit-learn, and seaborn, along with folium and plotly for maps.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn
import warnings
import os
import folium
warnings.filterwarnings('ignore')
world = pd.read_csv("https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_confirmed_global.csv", sep=',')
world.drop(columns = 'Province/State', inplace = True)
world
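Since the analysis depends on recent data, it can help to check the newest date a table actually covers. The following is a minimal sketch, assuming the wide JHU layout in which every non-identifier column is named with a date such as `1/22/20`. The helper `latest_date` and the sample values are hypothetical and not part of the original notebook.

```python
import pandas as pd

def latest_date(wide, id_cols):
    """Return the most recent date among the date-named columns of a wide table."""
    date_cols = [c for c in wide.columns if c not in id_cols]
    # Column names follow month/day/two-digit-year, e.g. "1/22/20"
    return pd.to_datetime(date_cols, format="%m/%d/%y").max()

# Tiny made-up table in the same wide layout (one column per date)
sample = pd.DataFrame({
    "Country/Region": ["A", "B"],
    "Lat": [0.0, 1.0],
    "Long": [0.0, 1.0],
    "1/22/20": [0, 1],
    "1/23/20": [2, 3],
})
newest = latest_date(sample, ["Country/Region", "Lat", "Long"])
print(newest.date())
```

The same call should work on `world` with the identifier columns `["Country/Region", "Lat", "Long"]`, provided the column layout matches this assumption.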
Confirmed cases in each state of the United States
us = pd.read_csv("https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_confirmed_US.csv", sep=',')
MD = us[us["Province_State"] == "Maryland"]
frames = [MD]
# Sum the county rows into one statewide row of cumulative counts per date
# (DataFrame.append was removed in pandas 2.0, so rows are combined with pd.concat)
data = MD.drop(us.columns[0:11], axis=1).sum(numeric_only=True).to_frame().T
list1 = ["New York", "Florida", "Nebraska", "Kansas", "Washington", "California"]
for x in list1:
    state = us[us["Province_State"] == x]
    frames.append(state)
    time = state.drop(state.columns[0:11], axis=1)
    # Sum once; summing a frame that already holds a total row would double the counts
    total = time.sum(numeric_only=True).to_frame().T
    data = pd.concat([data, total], ignore_index=True)
data = data.rename(index={0: 'Maryland', 1: 'New York', 2: 'Florida', 3: 'Nebraska', 4: 'Kansas', 5: 'Washington', 6: 'California'})
result = pd.concat(frames)
data
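The per-state totals built above can also be computed in one step with `groupby`. This is a sketch of an alternative, not the notebook's own method. The county rows and values below are made up for illustration, and only the column names `Admin2` and `Province_State` come from the JHU file.

```python
import pandas as pd

# Made-up county rows in the JHU wide layout (values are illustrative only)
counties = pd.DataFrame({
    "Admin2": ["X", "Y", "Z"],
    "Province_State": ["Maryland", "Maryland", "Kansas"],
    "1/22/20": [1, 2, 5],
    "1/23/20": [3, 4, 6],
})
date_cols = ["1/22/20", "1/23/20"]
states = ["Maryland", "Kansas"]

totals = (
    counties[counties["Province_State"].isin(states)]
    .groupby("Province_State")[date_cols]
    .sum()          # one row of cumulative counts per state
    .loc[states]    # keep the order of the state list
)
print(totals)
```

Grouping by state name labels the rows directly, so no positional `rename` is needed and the result does not depend on how many county rows each state has.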
Number of deaths by state in the US
us_dead = pd.read_csv("https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_deaths_US.csv", sep=',')
MD2 = us_dead[us_dead["Province_State"] == "Maryland"]
# Sum the county rows into one statewide row of cumulative deaths per date
data2 = MD2.drop(MD2.columns[0:12], axis=1).sum(numeric_only=True).to_frame().T
for x in list1:
    state2 = us_dead[us_dead["Province_State"] == x]
    time2 = state2.drop(state2.columns[0:12], axis=1)
    # Sum once; summing a frame that already holds a total row would double the counts
    total2 = time2.sum(numeric_only=True).to_frame().T
    data2 = pd.concat([data2, total2], ignore_index=True)
data2 = data2.rename(index={0: 'Maryland', 1: 'New York', 2: 'Florida', 3: 'Nebraska', 4: 'Kansas', 5: 'Washington', 6: 'California'})
data2
Trend in confirmed cases
# Transpose so dates form the index and each state is a column
# (DataFrame.swapaxes is deprecated since pandas 2.1)
data = data.T
data2 = data2.T
data.plot()
Trend in deaths
data2.plot()
Drawing a map
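# Drop UID, iso2, iso3, code3, FIPS, Country_Region and Combined_Key
result = result.drop(us.columns[[0,1,2,3,4,7,10]], axis=1)
# Reshape from one column per date to one row per county and date
melted = pd.melt(result, ['Admin2','Province_State', 'Lat', 'Long_'], var_name="Date", value_name='Cases')
melted = melted.rename(columns={'Admin2': 'Admin', 'Long_': 'Long'})
melted["Date"] = pd.to_datetime(melted['Date'])
# Keep Province_State in the key so same-named counties in different states
# are not merged (which would also add their coordinates together)
melted = melted.groupby(['Province_State', 'Admin', 'Date']).sum()
# Cumulative count on the previous row; the shift runs across county boundaries
melted["Prev_day"] = melted['Cases'].shift(fill_value=0)
melted["Daily_change"] = melted['Cases'] - melted['Prev_day']
melted = melted.drop(columns=['Prev_day'])
melted = melted.reset_index()
# Discard negative changes (county boundaries and downward corrections in the source)
melted = melted[melted["Daily_change"] >= 0]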
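The shift-based daily change above runs across county boundaries, so the first row of each county is compared with the last row of the previous one. A per-county difference avoids this. The sketch below uses made-up counts and is an alternative to the notebook's approach, not a description of it.

```python
import pandas as pd

# Made-up cumulative counts for two counties over three days
long = pd.DataFrame({
    "Admin": ["X", "X", "X", "Y", "Y", "Y"],
    "Date": pd.to_datetime(["2020-01-22", "2020-01-23", "2020-01-24"] * 2),
    "Cases": [0, 2, 5, 10, 10, 13],
})
long = long.sort_values(["Admin", "Date"])

# diff() restarts within each county; the first day has no predecessor,
# so its change is taken to be the cumulative count on that day
long["Daily_change"] = (
    long.groupby("Admin")["Cases"].diff().fillna(long["Cases"]).astype(int)
)
print(long)
```

With this form, a negative `Daily_change` can only come from a downward correction in the source data, which makes the later filter easier to interpret.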
import plotly.express as px
df = melted
df["Date"] = df["Date"].astype(str)
fig = px.scatter_geo(df, lat="Lat", lon="Long",
                     hover_name="Admin", size="Cases", size_max=80,
                     animation_frame="Date",
                     scope="usa",
                     title="Total Cases")
fig.layout.updatemenus[0].buttons[0].args[1]["frame"]["duration"] = 100
fig.show()